Influence of Conditional Independence Assumption on Verb Subcategorization Detection

Authors

  • Katia Kermanidis
  • Manolis Maragoudakis
  • Nikos Fakotakis
  • George K. Kokkinakis
Abstract

Learning Bayesian Belief Networks from corpora has been applied to the automatic acquisition of verb subcategorization frames for Modern Greek (MG). We incorporate minimal linguistic resources, i.e., morphological tagging and phrase chunking, since a general-purpose syntactic parser for MG is currently unavailable. The approach is compared experimentally against Naive Bayes classification, which is based on the conditional independence assumption, and against two widely used methods, Log-Likelihood (LLR) and Relative Frequencies Threshold (RFT). We have experimented with a balanced corpus in order to ensure unbiased behaviour of the training model. The results show that capturing the inferential dependencies of the training data can improve precision by about 4% compared to Naive Bayes and by about 7% compared to LLR and RFT. Moreover, we have been able to achieve a precision exceeding 87% on the identification of subcategorization frames that are not known beforehand, and even limited training data prove sufficient to yield satisfactory results.
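To make the conditional independence assumption mentioned in the abstract concrete, the following is a minimal Naive Bayes sketch, not the authors' implementation. The feature names (`phrase`, `case`), the labels (`frame`, `not_frame`), and the toy training data are hypothetical and chosen only for illustration. The classifier scores a label as log P(label) plus the sum of log P(feature | label), treating the features as independent given the label. That independence is the assumption a Bayesian Belief Network relaxes by learning dependencies among features.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Collect the counts needed for Naive Bayes from (features, label) pairs."""
    class_counts = Counter()
    feature_counts = defaultdict(Counter)  # (label, feature name) -> value counts
    values = defaultdict(set)              # feature name -> values seen in training
    for features, label in examples:
        class_counts[label] += 1
        for name, value in features.items():
            feature_counts[(label, name)][value] += 1
            values[name].add(value)
    return class_counts, feature_counts, values


def log_posterior(features, label, model):
    """Unnormalised log P(label | features) under conditional independence."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    score = math.log(class_counts[label] / total)
    for name, value in features.items():
        # Each feature contributes independently given the label.
        # Add-one (Laplace) smoothing avoids log(0) for unseen values.
        numerator = feature_counts[(label, name)][value] + 1
        denominator = class_counts[label] + len(values[name])
        score += math.log(numerator / denominator)
    return score


def classify(features, model):
    """Return the label with the highest log posterior."""
    class_counts = model[0]
    return max(class_counts, key=lambda label: log_posterior(features, label, model))


# Hypothetical toy data: chunk type and case of a phrase near the verb.
TOY_EXAMPLES = [
    ({"phrase": "NP", "case": "acc"}, "frame"),
    ({"phrase": "NP", "case": "acc"}, "frame"),
    ({"phrase": "PP", "case": "gen"}, "not_frame"),
    ({"phrase": "PP", "case": "acc"}, "not_frame"),
]
MODEL = train_naive_bayes(TOY_EXAMPLES)
```

Calling `classify({"phrase": "NP", "case": "acc"}, MODEL)` picks the label whose smoothed per-feature likelihoods multiply to the larger value. Because each feature is scored separately, any correlation between `phrase` and `case` is ignored, which is the limitation the abstract contrasts with learned network dependencies.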

Similar resources

Verb Sense and Verb Subcategorization Probabilities

Roland, Douglas William (Ph.D., Linguistics) Verb Sense and Verb Subcategorization Probabilities Thesis directed by Associate Professor Daniel S. Jurafsky This dissertation investigates a variety of problems in psycholinguistics and computational linguistics caused by the differences in verb subcategorization probabilities found between various corpora and experimental data sets. For psycholing...

Optimum Local Decision Rules in a Distributed Detection System with Dependent Observations

The theory of distributed detection is receiving a lot of attention. A common assumption used in previous studies is the conditional independence of the observations. In this paper, the optimization of local decision rules for distributed detection networks with correlated observations is considered. We focus on presenting the detection theory for parallel distributed detection networks with fi...

Subcategorization acquisition

Manual development of large subcategorised lexicons has proved difficult because predicates change behaviour between sublanguages, domains and over time. Yet access to a comprehensive subcategorization lexicon is vital for successful parsing capable of recovering predicate-argument relations, and probabilistic parsers would greatly benefit from accurate information concerning the relative likel...

How Verb Subcategorization Frequencies Are Affected By Corpus Choice

The probabilistic relation between verbs and their arguments plays an important role in modern statistical parsers and supertaggers, and in psychological theories of language processing. But these probabilities are computed in very different ways by the two sets of researchers. Computational linguists compute verb subcategorization probabilities from large corpora while psycholinguists compute ...

Publication date: 2001